CLAIMS

What is claimed is: 



1. A computer system comprising:
    a processor; and
    a storage device coupled to the processor and having stored therein an instruction, which when executed by the processor, causes the processor to at least,
        access a packed data operand having at least two portions of data elements;
        select a set of data elements from a portion of the packed data operand, the portion including at least two data elements;
        copy each data element of the selected set of data elements to specified data fields located in the corresponding portion of a destination operand.

2. The computer system of claim 1, wherein the packed data operand includes eight data elements and the processor selects a set of data elements from one of either the upper half or the lower half of the packed data operand.

3. The computer system of claim 1, wherein the storage device further comprises a packing device for packing integer data into the data elements.

4. The computer system of claim 1, wherein the data elements are 16-bit data elements and wherein the data packed and destination operands are each 128-bit operands.

5. The computer system of claim 1, wherein the data packed and destination operands are the same operand.
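
The select-and-copy behavior recited in claims 1 to 5 can be illustrated with a minimal sketch. It is not taken from the specification. The function name `copy_within_portion`, the `mapping` dictionary (destination field offset to source element offset), and the convention that portion 1 is the upper half are all assumptions made for illustration. Fields of the destination that are not named in the mapping are left unchanged here, which the claims do not specify.

```python
def copy_within_portion(src, dst, portion, mapping, portions=2):
    """Copy selected elements of one portion of src into the same portion of dst.

    mapping: {destination offset: source offset}, both relative to the start
    of the chosen portion (a hypothetical encoding, not from the claims).
    """
    if len(src) != len(dst) or len(src) % portions:
        raise ValueError("operands must have equal length divisible by the portion count")
    size = len(src) // portions          # elements per portion
    base = portion * size                # first element index of the chosen portion
    out = list(dst)                      # assumption: unspecified fields are kept
    for dst_off, src_off in mapping.items():
        if not (0 <= dst_off < size and 0 <= src_off < size):
            raise ValueError("offsets must stay inside the portion")
        out[base + dst_off] = src[base + src_off]
    return out


# Eight data elements, as in claim 2; reverse the upper half into the upper half.
src = [10, 11, 12, 13, 14, 15, 16, 17]
dst = [0] * 8
result = copy_within_portion(src, dst, portion=1, mapping={0: 3, 1: 2, 2: 1, 3: 0})

# Claim 5: source and destination may be the same operand.
swapped = copy_within_portion(src, src, portion=0, mapping={0: 1, 1: 0})
```

Because the function reads only from `src` and writes to a copy of `dst`, passing the same list for both arguments behaves as an in-place shuffle without overwriting elements that are still needed.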

6. A computer-implemented method comprising:
    decoding a single instruction;
    in response to decoding the single instruction, accessing a packed data operand including at least two portions of data elements;
    selecting a set of data elements from a portion of the packed data operand, the portion including at least two data elements;
    copying each data element of the selected set of data elements to specified data fields located in the corresponding portion of a destination operand.

7. The method of claim 6, wherein accessing a packed data operand including eight data elements and selecting a set of data elements from one of either the upper half or the lower half of the packed data operand.

8. The method of claim 6, further comprising packing integer data into the data elements.

9. The method of claim 6, wherein the data elements are 16-bit data elements and wherein the data packed and destination operands are each 128-bit operands.

10. The method of claim 6, wherein the data packed and destination operands are the same operand.

11. A computer-implemented method comprising:
    accessing data representative of a first three-dimensional image;
    altering the data using three-dimensional geometry to generate a second three-dimensional image, the method of altering at least including,
        accessing a packed data operand having at least two portions of data elements;
        selecting a set of data elements from a portion of the packed data operand, the portion including at least two data elements;
        copying each data element of the selected set of data elements to specified data fields located in the corresponding portion of a destination operand; and
    displaying the second three-dimensional image.

12. The method of claim 11, wherein the method of altering includes the performance of a three-dimensional transformation.

13. The method of claim 11, wherein accessing a packed data operand including eight data elements and selecting a set of data elements from one of either the upper half or the lower half of the packed data operand.

14. The method of claim 11, wherein the method of altering includes packing integer data into the data elements.

15. The method of claim 11, wherein the data elements are 16-bit data elements and wherein the data packed and destination operands are each 128-bit operands.

16. The method of claim 11, wherein the data packed and destination operands are the same operand.

17. A method for shuffling packed data elements comprising:
    decoding a single instruction specifying a source operand, a destination operand, and a field of control bits;
    responsive to the single instruction and the field of control bits, generating a first portion of the destination operand comprised of data elements from the same portion of the source operand.

18. The method of claim 17, wherein the portion is one of either the upper half or the lower half of the source and destination operands.

19. The method of claim 17, wherein the source operand and the destination operand are the same operand.
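
One way to read the "field of control bits" of claims 17 to 19 is sketched below. The claims do not fix an encoding, so the following are assumptions for illustration only: the field is eight bits wide and split into four 2-bit selectors (lowest bits first), the operand holds eight data elements with index 0 least significant, and the half that is not being shuffled passes through unchanged. The names `decode_control` and `shuffle_half` are hypothetical.

```python
def decode_control(control):
    """Split an 8-bit control field into four 2-bit selectors, lowest bits first."""
    if not 0 <= control <= 0xFF:
        raise ValueError("control field must fit in eight bits")
    return [(control >> (2 * i)) & 0b11 for i in range(4)]


def shuffle_half(src, control, upper):
    """Build one half of the destination from elements of the same half of src."""
    if len(src) != 8:
        raise ValueError("this sketch assumes eight data elements")
    base = 4 if upper else 0              # start index of the chosen half
    out = list(src)                       # assumption: the other half is copied as-is
    for i, sel in enumerate(decode_control(control)):
        out[base + i] = src[base + sel]   # selector indexes within the same half
    return out


elements = list("abcdefgh")
# 0x1B = 0b00_01_10_11 decodes to selectors [3, 2, 1, 0], which reverses a half.
low_reversed = shuffle_half(elements, 0x1B, upper=False)
high_reversed = shuffle_half(elements, 0x1B, upper=True)
```

Each selector can only address an element inside the half being generated, which is what keeps every destination element in "the same portion" as its source.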

20. A processor comprising:
    a decoder to decode at least one single instruction specifying a source and destination operands, and a field of control bits; and
    an execution unit, responsive to the field of control bits, to generate a first portion of the destination operand comprised of data elements from the same portion of the source operand.

21. The processor of claim 20, wherein the portion is one of either the upper half or the lower half of the source and destination operands.

22. The processor of claim 20, wherein the decoder is to decode a first instruction to shuffle 16-bit data elements from a first source operand of 128 bits to a first destination operand of 128 bits, a second instruction to shuffle data elements from the upper half of a second source operand to the upper half of a second destination operand, and a third instruction to shuffle data elements from the lower half of a third source operand to the lower half of a third destination operand.

23. A program loaded into a computer readable medium comprising:
    a computer readable code to access a packed data operand having at least two portions of data elements;
    a computer readable code to select a portion of the packed data operand, the portion including at least two data elements;
    a computer readable code to select a set of data elements from the selected portion;
    a computer readable code to copy each data element of the selected set of data elements to specified data fields located in the corresponding portion of a destination operand.

24. The program of claim 23, wherein the source operand and the destination operand are the same operand.

25. The program of claim 23, wherein m = 2 and the computer readable code to select a set of data elements selects one of either the upper half or the lower half of the packed data operand.

26. A processor-implemented method comprising:
    decoding a single instruction specifying a source operand of 128 bits, a destination operand of 128 bits, and a control word of eight bits;
    responsive to the single instruction and the control word, shuffling 16-bit data elements from the source operand to the destination operand.

27. The method of claim 26, wherein the source operand and the destination operand are the same operand.

28. A processor comprising:
    a decoder to decode:
        a first instruction specifying a first source operand of 128 bits, a first destination operand of 128 bits, and a first control word of eight bits;
        a second instruction specifying a second source operand of 128 bits, a second destination operand of 128 bits, and a second control word of eight bits;
        a third instruction specifying a third source operand of 128 bits, a third destination operand of 128 bits, and a third control word of eight bits; and
    an execution unit, responsive to the first instruction and the first control word, to shuffle 16-bit data elements from the first source operand to the first destination operand; responsive to the second instruction and the second control word, to shuffle data elements from the upper half of the second source operand to the upper half of the second destination operand; responsive to the third instruction and the third control word, to shuffle data elements from the lower half of the third source operand to the lower half of the third destination operand.

29. The processor of claim 28, wherein a source operand and a destination operand are the same operand.

30. The processor of claim 28, wherein the processor is comprised of either or both hardware and software components.
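
The upper-half and lower-half shuffles of claims 26 to 30 can also be modeled directly on 128-bit values. The sketch below is illustrative only and makes several assumptions that the claims do not state: the eight-bit control word is read as four 2-bit selectors (lowest bits first), each selector picks one of the four 16-bit elements within the half being shuffled, and the other 64 bits are passed through unchanged. The function names are hypothetical. The first instruction of claim 28 is not modeled, because the claims do not say how an eight-bit control word addresses all eight elements.

```python
MASK16 = 0xFFFF
MASK64 = (1 << 64) - 1


def _shuffle_words64(half, control):
    """Rearrange the four 16-bit words of a 64-bit value using 2-bit selectors."""
    out = 0
    for i in range(4):
        sel = (control >> (2 * i)) & 0b11        # selector for destination word i
        word = (half >> (16 * sel)) & MASK16     # source word picked from the same half
        out |= word << (16 * i)
    return out


def _check(operand, control):
    if not 0 <= operand < (1 << 128):
        raise ValueError("operand must fit in 128 bits")
    if not 0 <= control <= 0xFF:
        raise ValueError("control word must fit in eight bits")


def shuffle_low(operand, control):
    """Shuffle the lower 64 bits; the upper 64 bits are kept (assumption)."""
    _check(operand, control)
    return (operand & (MASK64 << 64)) | _shuffle_words64(operand & MASK64, control)


def shuffle_high(operand, control):
    """Shuffle the upper 64 bits; the lower 64 bits are kept (assumption)."""
    _check(operand, control)
    return (_shuffle_words64(operand >> 64, control) << 64) | (operand & MASK64)


# Word i of x holds the value i, so the effect of each shuffle is easy to read.
x = 0x0007_0006_0005_0004_0003_0002_0001_0000
low = shuffle_low(x, 0x1B)     # 0x1B reverses the four words of the lower half
high = shuffle_high(x, 0x1B)   # same control word applied to the upper half
```

Since each function returns a new value computed from its input, assigning the result back to the same variable models the case of claims 27 and 29, where the source and destination are the same operand.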